Book Review: Introduction to Information Retrieval by Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze
Abstract
Introduction to Information Retrieval by Manning, Raghavan, and Schütze is an up-to-date, thorough, and systematic introduction to information retrieval (IR) from a computer science perspective. Written as a textbook, its main audience is graduate and senior undergraduate students taking IR courses. The book will also be valuable to researchers in other computer science fields, such as computational linguistics, as well as to professional practitioners wishing to delve into the IR field.

The book is structured into 21 chapters, which gradually unfold the subject of information retrieval, starting with the fundamentals (such as Boolean retrieval, document indexing, the vector-space model, and evaluation in IR) and moving on to more advanced topics (such as probabilistic models, XML retrieval, text classification, machine learning for IR, document clustering, and Web retrieval). Pedagogical features of the book include short exercises at the end of each section and brief overviews of related research literature at the end of each chapter.

Some of the major strengths of the book are its accessibility, clarity, and good balance between theory and practice. There are many concrete examples throughout the book that facilitate understanding of complex topics. Although the book covers a broad selection of the major established and emerging topics in IR, it largely bypasses what are, in my opinion, two important subjects: natural language processing techniques in IR and interactive information retrieval. The authors refer to some research done in these areas in various chapters, but they do not give these subjects the same thorough treatment given to other topics in the book. To compensate, in the preface the authors provide references to the detailed coverage of these and some other topics in other textbooks. It also might have been useful if the authors had introduced some specialized IR tasks, such as opinion retrieval or enterprise search, which might benefit from more advanced NLP techniques.

Chapter 1 gives a succinct and focused introduction to the main concepts in IR, such as term, index, document, query, recall, precision, and so on. It outlines the main principles of Boolean retrieval, briefly criticizes it, and compares it to ranked retrieval. The authors also present a good real-world example of a commercial Boolean retrieval system.

Chapter 2 provides a detailed discussion of the initial stages of the document indexing process, which include tokenization, stemming and lemmatization, stop-word removal, and approaches to dealing with phrases at the indexing stage, namely bi-gram indexing and the use of positional indexes. …
Publication date: 2009